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Abstract 

The design of embedded control systems is mainly done with model-based tools such as Matlab/Simu- 
link. Numerical simulation is the central technique of development and verification of such tools. 
Floating-point arithmetic, which is well-known to only provide approximated results, is omnipresent 
' in this activity. In order to validate the behaviors of numerical simulations using abstract interpretation- 

I based static analysis, we present, theoretically and with experiments, a new partially relational abstract 

domain dedicated to floating-point variables. It comes from interval expansion of non-linear functions us- 
ing slopes and it is able to mimic all the behaviors of the floating-point arithmetic. Hence it is adapted to 
prove the absence of run-time errors or to analyze the numerical precision of embedded control systems. 
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; 1 Introduction 

(N 

I Embedded control systems are made of a software and a physical environment which aim at continuously 

. interact with each other. The design of such systems is usually realized with the model-based paradigm. 

Matlab/SimulinlQ is one of the most used tools for this purpose. It offers a convenient way to describe 
the software and the physical environment in an unified formalism. In order to verify that the control law, 
^) ' implemented in the software, fits the specification of the system, several numerical simulations are made under 

Matlab/Simulink. Nevertheless, this method is closer to test-based method than formal proof. Moreover, 
this verification method is strongly related to the floating-point arithmetic which provides approximated 
results. 

Our goal is the use of abstract interpretation-based static analysis [9] to validate the design of control 
embedded software described in Matlab/Simulink. In our previous work [3], we defined an analysis to validate 
that the behaviors given by numerical simulations are close to the exact mathematical behaviors. It was 
based on an interval abstraction of floating-point numbers which may produce too coarse results. In this 
article, our work is focused on a tight representation of the behaviors of the floating-point arithmetic in order 
to increase the precision of the analysis of Matlab/Simulink models. 

To emphasize the poor mathematical properties of the floating-point arithmetic, let us consider the sum 
of numbers given in Example [T] with a single precision floating-point arithmetic. The result of this sum is 
—2.08616257.10"® due to rounding errors, whereas the exact mathematical result is zero. 

Example 1. 

0.0007 + (-0.0097) + 0.0738 + (-0.3122) + 0.7102 + (-0.5709) + (-1.0953) 

+ 3.3002 + (-2.9619) + (-0.2353) + 2.4214 + (-1.7331) + 0.4121 



o 
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Example [T] shows that the summation of floating-point numbers is a very ill-conditioned problem ^51 
Chap. 6]. Indeed, small perturbations on the elements to sum produce a floating-point result which could 
be far from the exact result. Nevertheless, it is a very common operation in control embedded software. In 
particular, it is used in filtering algorithms or in regulation processes, such as for example in regulation. 
Remark that depending on the case, the rounding errors may stay insignificant and the behaviors of floating- 
point arithmetic may be safe. In consequence, a semantic model of this arithmetic could be used to prove 
the behaviors of embedded control software using floating-point numbers. 

The definition of abstract numerical domains for floating-point numbers is usually based on rational or 
real numbers [131124] to cope with the poor mathematical structure of the floating-point set. In consequence, 
these domains give an over-approximation of the floating-point behaviors. This is because they do not bring 
information about the kind of numerical instability appearing during computations. We underline that our 
goal is not interested in computing the rounding errors but the floating-point result. In others words, we 
want to compute the bounds of floating-point variables without considering the numerical quality of these 
bounds. 

Our main contribution is the definition of a new numerical abstract domain, called Floating-Point Slopes 
(FPS), dedicated to the study of floating-point numbers. It is based on interval expansion of non-linear 
functions named interval slopes introduced by Krawczyk and Neumaier [20) and, as we will show in this 
article, it is a partially relational domain. The main difference is that, in Proposition [H we adapt the 
interval slopes to deal with floating-point numbers. Moreover, we are able to tightly represent the behaviors 
of floating-point arithmetic with our domain. A few cases studies will show the practical use of our domain. 
Hence wc can prove properties on programs taking into account the behaviors of the floating-point arithmetic 
such that the absence of run-time errors or, by combining it with other domains e.g. ,4 , the quality of 
numerical computations. 

Content. In Section [2l we will present the main features of floating-point arithmetic and we will also 
introduce the interval expansions of functions. We will present our abstract domain FPS in Section [3] and 
the analysis of floating-point programs in Section [4] before describing experimental results in Section O In 
Section |B1 we will reference the related work before concluding in Section [71 

2 Background 

We recall the main features of the IEEE754-2008 standard of floating-point arithmetic in Section l^TTl Next in 
Section [2. 2 [ we present some results from interval analysis, in particular the interval expansion of functions. 

2.1 Floating-Point Arithmetic 

We briefly present the floating-point arithmetic, more details are available in [23 and the references therein. 
The IEEE754-2008 standard [18] defines the floating-point arithmetic in base 2 which is used in almost every 
computeJl. 

Floating-point numbers have the following form: / = s.m.2'^. The value s represents the sign, the value 
m is the significand represented with p bits and the value e is the exponent of the floating-point number / 
which belongs into the interval [cmin, Cmax] such that emax = —Smin + 1- There are two kinds of numbers in 
this representation. Normalize numbers for which the significand implicitly starts with a 1 and denormalized 
numbers that implicitly starts with a 0. The later are used to gain accuracy around zero by slowly degrading 
the precision. 

The standard defines different values of p and emin: p — 24 and eaun — —126 for the single precision and 
p — 53 and Cmin = —1022 for the double precision. We call normal range the set of absolute real values in 
[2*^""", (2 — 2^~P)2'^''^'^^] and the subnormal range the set of numbers in [0, 2'^'"'"[. 

^PID stands for proportional-integral-derivativc. It is a generic method of feedback loop control widely used in industry. 
^It also defines this arithmetic in base 10 but it is not relevant for our purpose. 
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The set of floating-point numbers (single or double precision) is represented by F which is closed under 
negation. A few special values represent special cases: the values — oo and +00 to represent the negative or 



the positive overflow; and the value A'^aAo represents invalid results such that 

The standard defines round-off functions which convert exact real numbers into floating-point numbers. 
Wc arc mainly concerned by the rounding to the nearest ties to eveiH (noted fl), the rounding towards -l-oo 
and rounding toward —00. The round-off functions follow the correct rounding property, i.e. the result of 
a floating-point operation is the same that the rounding of the exact mathematical result. Note that these 
functions are monotone. We are interested in this article by computing the range of floating-point variables 
rounded to the nearest which is the default mode of rounding in computers. 

A property of the round-off function fl is given in Equation ([T}. It characterizes the overflow, i.e. the 
rounding result is greater than the biggest element of F and the case of the generation of 0. This definition 
only uses positive numbers, using the symmetry property of F, we can easily deduce the definition for the 
negative part. We denote by cr = 2*^'°'""''+^ the smallest positive subnormal number and the largest finite 
floating-point number by S = (2 — 2^~p)2'^"''''. 



An underflow |2S1 Sect. 2.3] is detected when the rounding result is less than 2'^""°, i.e. the result is in the 
subnormal range. 

The errors associated to a correct rounding is defined in Equation ([2]) and it is valid for all floating-point 
numbers x and y except —00 and -l-oo (see 28, Chap. 2, Sect. 2.2]). The operation o g {-|-, — , x , -;-} but it is 
also valid for the square root. The relative rounding error unit is denoted by /i. In single precision, fj, ~ 2^^* 
and a = 2~^^^ and in double precision, /i — 2"^^ and cr = 2~^"^^. 



If f l(x o y) is in the normal range or if o G {+, — } then 62 is equal to zero. If fl{x o y) is in the subnormal 
range then ei is equal to zero. 

Numerical instabilities in programs come from the rounding representation of values and they also came 
from two problems due to finite precision: 

Absorption If |x| < /i|y| then it happens that fl(a; + y) ~ ^{y)- For example, in single precision, the result 
of fl(10^ — 10~^) is fl(10'*). In numerical analysis, the solution avoid this phenomenon is to sort the 
sequence of numbers [171 Chap. 4]. This solution is not applicable when the numbers to add are given 
by a sensor measuring the physical environment. 

Cancellation It appears in the subtraction fl(a; — y) if {\x ~ y\) < fi{\x\ + \y\) then the relative errors 
can be arbitrary big. Indeed, the rounding errors take usually place in the least significant digits of 
floating-point numbers. These errors may become preponderant in the result of a subtraction when the 
most significant digits of two closed numbers cancelled each others. In numerical analysis, subtraction 
of numbers coming from long computations are avoided to limit this phenomena. We cannot apply 
this solution in embedded control systems where some results are used at different instants of time. 

2.2 Interval Arithmetic 

We introduce interval arithmetic and in particular, the interval expansion of functions which is an element 
of our abstract domain FPS. 

^NaN stands for Not A Number. 

^The IEEE754-2008 standard introduces two rounding modes to the nearest with respect to the previous IEEE754-1985 and 
IEEE754-1987 standards. These two modes only differ when an exact result is in half-way of two floating-point numbers. In 
rounding-to-nearest-tie-to-even mode, the floating-point number whose the least significand bit is even is chosen. Note that this 
definition is used in all the other revisions of the IEEE754 standard, see 1281 Chap. 3.4] for more details. 





(1) 



f l(a; o y) = (a; o y)(l + ei) + £2 



with |ei| < /i and |e2| < — o" 



(2) 
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2.2.1 Standard Interval Arithmetic. 

The interval arithmetic ^27j has been defined to avoid the problem of approximated results coming from the 
floating-point arithmetic. It had also been used as the first numerical abstract domain in [S]. 

When dealing with floating-point intervals the bounds have to be rounded to outward as in [MJ Sect. 3]. 
In Example [21 we give the result of the interval evaluation in single precision of a sum of floating-point 
numbers. 

Example 2. Using the interval domain for floating-point arithmetic (241 Sect. 3] the result of the sum defined 
by EI=i + 10^ + EI=i + E1=i° 10"^ IS [IIIOO, 11101.953]. The exact result is lllOI while the 

floating-point result is 11100 due to an absorption phenomena. The floating-point result and the exact result 
are in the result interval but we cannot distinguish them any more. 

A source of over-approximation is known in the interval arithmetic as the dependency problem which is 
also known in static analysis as the non-relational aspect. For example, if some variable has value [a, 6], 
then the result of a; — a; is [a — b,b — a] which is equal to zero only if a = 6. This problem is addressed by 
considering interval expansions of functions. 

Notations. We denote by x a real number and by x a, vector of real numbers. Interval values are in capital 
letters X or denoted by [a, 6] where a is the lower bound and b is the upper bound of the interval. A vector 
of interval values will be denoted by X. We denote by [f] the interval extension of a function f obtained by 
substitution of all the arithmetic operations with their equivalent in interval. The center of an interval [a, b] 
is represented by micl([a, b]) = a + 0.5 x (6 — a). 

2.2.2 Extended Interval Arithmetic. 

We are interested in the computation of the image of a vector of interval X by a non- linear function f : M" — ^ M 
only composed by additions, subtractions, multiplications and divisions and square root. In order to reduce 
over-approximations in the interval arithmetic, some interval expansions have been developed. The first one 
is based on the Mean-Value Theorem and it is expressed as: 

f(X) C f{z) + [f'](X)(X - Vz G X . (3) 

The first-order approximation of tlie range of a function f can be defined thanks to its first order derivative 
f over X. We can then approximate f(X) by a pair (f(z), [f'](X)) that are the value of f at point z and the 
interval extension of f evaluated over X. 

A second interval expansion has been defined by Krawczyk and Neumaier |20| using the notion of slopes 
which reduced the approximation of the derivative form. It is defined by the relation: 

f(X)Cf(£) + [F^l(X)(X-£) 

with F''(X) = <^ — -^-.xeXAz^x^ . 

Then we can represent f(X) by a pair {f{z), [F^](X)) that are the value of f in the point z and the interval 
extension of the slope F^(A) of f. 

Note that the value z is constructed, in general, from the centers of the interval variables appearing in 
the function f for both interval expansions. 

An interesting feature is that we can inductively compute the derivative or the slope of a functions using 
automatic differentiation techniques [1]. It is a semantic-based method to compute derivatives. In this 
context, we call independent variables some input variables of a program with respect to which derivatives 
are computed. We call dependent variables output variables whose derivatives are desired. A derivative object 
represents derivative information, such as a vector of partial derivatives like (de/dxi, . . . ,de/dxn) of some 
expression e with respect to a vector x of independent variables. The main idea of automatic differentiation is 
that every complicated function f , i.e. a program, is composed by simplest elements, i.e. program instructions. 



4 



Knowing the derivatives of these elements with respect to some independent variables, we can compute the 
derivatives or the slopes of f following the differential calculus rules. Furthermore, using interval arithmetic 
in the differential calculus rules, we can guarantee the result. 

We give in Table [T] the rules to compute derivatives or slopes with respect to the structure of arithmetic 
expressions. We assume that we know the number of independent variables in the programs and we denote 
by n this number. The variable V'"'^ represents the vector of independent variables with respect to which the 
derivatives are computed. We denote by 6i the interval vector of length n, having all its coordinates equal to 
[0, 0] except the i-th element equals to [1,1]. So, we consider that all the independent variables are assigned 
to a unique position i in V'"'^ and it is initially assigned with a derivative object equal to 5i. Following 
Table [1] where g and h represent variables with derivative object, a constant value c has a derivative object 
equal to zero (the interval vector has all its coordinates equal to [0,0]). For addition and subtraction, 
the result is the vector addition or the vector subtraction of the derivative objects. For multiplication and 
division, it is more complicated but the rules come from the standard rules of the composition of derivatives, 
e.g. (u X v)' = u' X V + u X v' . A proof of the computation rule^ for slopes can be found in [SUl Sect. 1]. 
Note that we can apply automatic differentiation for other functions, such as the square root, using the rule 
of function composition, (/ o gy{x) = f {g{x))g' {x). 

These interval expansions of functions, using either (f(z), [f'](X)) the derivative form or (f(z), [f^](X)) 
the slope form, define a straightforward semantics of arithmetic expressions which can be used to compute 
bounds of variables. 



Table 1: Automatic differentiation rules for derivatives and slopes 



Function 


Derivative arithmetic 


Slope arithmetic 


c e M 

g + h 
g-h 
g X h 

g 




[g'](X) + [h'](X) 
[g'](X)-[h'](X) 
[g'](X)x h(X)+g(X)x [h'](X) 

[g'](X)xh(X)-[h'](X)xg(X) 




[G--](X) + [H--](X) 
[G--](X)-[r1(X) 
[G'"](X) X h(X)+g(z) X [r](X) 
[G^-](X)-[r](X)xf[| 


h 


h2(X) 

1 [g'](x) 


h(X) 

[G%X) 


^ \/g(X) 


^g{z) + ^giX) 



Remark 1. The difference in over-approximated result between the derivative form and the slope form is 
in the multiplication and the division rules. In the derivative form, we need to evaluate the two operands (g 
and h ) using interval arithmetic while we only need to evaluate one of them in the slope form. Note also that 
we could have defined the multiplication by [H^](X) x g(X) + h(z) x [G^](X) (the division has also two forms) 
but the two possible forms of slope are over- approximations off{X). Nevertheless, a possible way to choose 
between the two forms is to keep the form which gives the smallest approximation off{X). 

In Figure [1] we give two graphical representations of interval slope expansion. For this purpose, we want 



to compute the image of x by the function f{x) — x{l — x) We consider in Figure 1 (a) that x S [—1,1/2] 
and we get as a result that f{x) E [—1, 19/8] which is an over-approximation of the exact result [—1,5/4]. 
The midpoint is —1/4 and the set of slopes is bounded by the interval [0,9/4]. The dashed lines represent 



the linear approximation of the image. In Figure 1(b) [ we consider that x e [—1/2, 1/2] and the result is 



/(x) g [1/4, 7/4] which is still an over-approximation of the exact result [1/4, 5/4]. In that case, the midpoint 
is and the set of slopes is bounded by the interval [0,3/2]. Note that the smaller the interval the better 
the approximation is. 



^In [201 Sect. 2], the authors went also into detail of the complexity of these operations. 
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(a) X G [-1, 1/2] (b) X G [-1/2, 1/2] 

Figure 1: Two examples of the interval expansion with slopes 



Example [3] shows that we can encode with interval slopes the list of variables contributing in the result of 
an arithmetic expression. In particular, the vector composing the interval slope of the variable t represents 
the influence of the variables a, h and c on the value of t. For example, we know that a modification of 
the value of the variable a produce a modification of the result with the same order of the modification 
on a because the slope associated to a is [1,1]. But a modification on the variable b by A{, will produce a 
modification on the f by Af, x Vc because the slope of h is equal to Vc. 

Example 3. Let t = a + b x c, we want to compute the interval slope [T^](X) of t. We consider that 
ymd _ ^1 interval vector of the values of these variables. We suppose that the interval 

slope expansion of a, b and c are (2a,[A^](X) — Si), (zfc,[B^](X) ~ S2), and (zc,[C^]{jC) — 63) respectively. 
The interval value associated to c is Vc i.e. Vc = + [C^](X)(X — z). 

[r]{X) = [Al(X) + z,[c'm + [B^1(X) (z, + [C^1(X)(X - z)) 
= ([1, 1], 0, 0) + 2f, X (0, 0, [1, 1]) + (0, [1, 1], 0) X V, 

= ([1,1],[1,1]XV„Z6X[1,1]) 

= i[l,l],Y,,[zt,zt]) 

As seen in Example[3l interval slopes represent relations between the inputs and the outputs of a function. 
By computing interval slopes, we build step by step the set of variables related to arithmetic expressions 
in programs. In static analysis, we can use this interval expansion to track the influence of the inputs of a 
program on its outputs. Hence the choice of the set V'"'^ of independent variables is given by the set of the 
input variables of the program to analyse. Moreover, we can add in V'"'^ all the other variables which may 
influence output. 

3 Floating-Point Slopes 

We present in this section our new abstract domain FPS. In Section l3.ll we adapt the computation rules 
of interval slopes to take into account floating-point arithmetic. Next in Section 13.21 we define an abstract 
semantics of arithmetic expressions over FPS values taking into account the behaviors of floating-point arith- 
metic. And in Section l373l we deflne the order structure of the FPS domain. 
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3.1 Floating-Point Version of Interval Slopes 



The definition of interval slope expansion in Section [2.21 manipulates real numbers. In case of floating-point 
numbers, we have to take into account the round-off function and the rounding-errors. 

We show in Proposition [1] that the range of a non-linear function f of floating-point numbers can be 
soundly over-approximated by a floating-point slope. The function f must respect the correct rounding, 
i.e. the property of Equation ([2]) must hold. In other words, the result of an operation over set of floating- 
point numbers is over-approximated by the result of the same operation over floating-point slopes by adding 
a small quantity depending on the relative rounding error unit /i and the absolute error a. 

Proposition 1. Let f : D C M" R. be an arithmetic operation of the form g oh with o G {+, — , x , or 
i.e. f respects the correct rounding. For all X C D and zCzD, we have: 

fl(f(x)) c f(£)(i + [-^,^]) + [-|, |] + r]{x){x -z){i + • 



Proof. 

fl(f(X)) = {f(f )(1 + e^) + e:,:xet} by Eq. © 

C f(X) + f(X){e, : f G X} + {e, : f G X} 

C (f (£) + [F^^] (X)(X - z)) + {e. : X G X} by Eq. @ 

+ {f{z) + [F^] (X) (X - z)) {e, : X G X} 
Cf(2)(l + {e, : f G X}) +{e, : f G X} 

+ r](X)(X-z)(l + {e, :f GX}) 
C f{z){l + + [-f ' |] < M by Eq. m 

+ r](X)(X - z){l + hM,M]) < by Eq. © 

□ □ 

Remark 2. As the floating-point version of slopes is based on ii and a, we can represent the floating- 
point behaviors depending of the hardware. For example, extended precisioi\j is represented using the values 
/i — 2^^^ and a — 2^^^'*'*^. Furthermore following f^, we can compute the result of a double rounding with 
fi = (2" + 2)2-64 ^ ^ + 1)2-1086. 

Proposition [1] shows that we can compute the floating-point range of a function f, respecting the correct 
rounding, using interval slopes expansion. That is a set of floating-point values can is represented by a pair: 



[f](z)(l + [-M,M])+ r]iX){l + [-^^,^^])) 



The first element is a small interval rounding to the nearest around f{z) for which we have to take into 
account the possible rounding errors. The second element is the interval slopes which have to take account 
of relative errors. Note that this adaptation adds a very little overhead of computations compared to the 
definition of interval slopes by Krawczyk and Neumaier. 



^In some hardware, e.g. Intel x87, floating-point numbers may be encoded with 80 bits in registers, i.e. the significand is 64 
bits long. 

*It may happen on hardware using extended precision. Results of computations are rounded in registers and they are 
rounded again, with a less precision, in memory. 
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3.2 Semantics of Arithmetic Operations 



In this section, we define the abstract semantics of arithmetic operations over elements of floating-point 
slopes domain in order to mimic the behaviors of the floating-point arithmetic. We denote by I the set of 
intervals and by S = I x l'^ ' the set of slopes. An element s of S is represented by a pair (M, S) where M is 
a floating-point interval and S is a vector of floating-point intervals. We denote by (I, Cj, _Lj, Ti, Ui, Fli) the 
lattice of intervals. First we define some auxiliary functions before presenting the semantics of arithmetic 
expressions over FPS. 

The function l defined in Equation ([5|) computes the interval value associated to a floating-point slopes 
(M, S). We assume that the values of independent variables are kept in a separate interval vector Vyind. The 
notation mid(Vvind) stands for the component- wise application of the function mid on all the components of 
the vector Vyind . Note that • represents the scalar product. 



t((M,S)) = M-l-S- (Vy.nd - mid(Vv.nd)) 



(5) 



The function k defined in Equation ([5]) transforms an interval value [a, 6]^ associated to the i-th independent 
variable into a floating-point slope. 



K ^[a, 6]*^ = ([m, m], with m = mid([a,fe]) 



(6) 



This function k is used in two cases: i) To initialize all the independent variables at the beginning of an 
analysis, ii) In the meet operation, see Section [331 

We can detect overflows and generations of zero by using the function $ defined in Equation (O . We have 
two kinds or rules: total rules when we are certain that a zero or an overflow occur and partial rules when a 
part of the set described by a floating-point slope generates a zero or an overflow. With the function t we can 
determine for an element (M, S) € § if (M, S) represents an overflow or a zero. Hence we represent the flnite 
precision of the floating-point arithmetic. We denote by p^^ and by nioo the interval vectors with all their 
components equal to [-|-oo, -|-cx)] and [— oo,— oo] respectively. We recall that a is the smallest denormalized 
and E is the largest floating-point numbers. 



$(M,S) = < 



([0,0], 0) 
(M,0 U, S) 



([-|-oo,-|-oo],p„ 
(M,p^ U, S) 



([-oo,- 
(M,mc 



[(M,S) 



-oo], moo) 
Ui S) 



and M = 



and M 



and M 



if t(M,S) Cj [-f,f] 
if.(M,S) ni]-f,|[/_L, 

'[0,0] ifM E.]-f,f[ 

[0, 0] Ui M otherwise 
if i(M,S) Cj ]S,+oo] 
if t(M,S) n, ]E,+oo] /_L, 

[+oo,-foo] ifM [Zi]E,-|-oo] 

[+00, -f oo] Uj M otherwise 
if i(M,S) En [-oo,-E[ 
if i(M,S) n, [-oo,-E[/_Li 

-oo,-oo] ifM Cj [-oo,-E[ 

[— oo, — oo] Ui M otherwise 

otherwise 



(7) 



Equation ([7]) is an adaptation of the rule deflned in Equation ([T]) to deal with FPS values. Furthermore, 
the abstract values (-|-oo,poo) and (— oo,moo) represent the special floating-point values -l-oo and — oo 
respectively. As in floating-point arithmetic, the values (-|-oo,poo) and (— oo,moo) are absorbing elements. 

An interesting feature of interval slopes is that we can mimic the absorption phenomenon by setting 
to zero the interval slope of the absorbed operand. We define the function p for this purpose. Indeed, an 
abstract value (M, S) already supports partial absorption as M is computed with a rounding to the nearest 
but S have to be reduced to represent the absence of the influence of particular independent variables. The 
reduction of an abstract value g = (Mg, Sg) compared to an abstract value h = (M/j, Sh), denoted by p{g \ h), 
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is defined in Equation ([5]). 



p{g \h)= < 



([0,0], 0) if (.(Mg.Sj) ci [^Ji,^l\ x /.(Mft,Sh) 

(Mg, Ui Sj) if (.(Mg, Sj) ni [/i, ^] X /,(Mh, Sh) / ±1 

[0,0] if Mg c, X t(Mh,Sh) 



and Mg = 



otherwise 



[0, 0] Ui M„ otherwise 



(8) 



Equation (jSj models the absorption phenomenon by explicitly setting to zero the values of a slope. As 
mentioned in Section [2. 2[ a slope shows which variables influence the computation of an arithmetic expres- 
sion. But, absorption phenomena induce that an operand does not influence the result of an addition or a 
subtraction any more. 

Using the functions $, p and i, we inductively define on the structure of arithmetic expressions the 
abstract semantics |.]g of floating-point slopes in Figure[2] We denote by env" an abstract environment which 
associates to each program variable a floating-point slope. For each arithmetic operation, we component- 
wisely combine the elements of the abstract operands [g]s(env'*) = (Mg,Sg) and (env") = (M;i,Sft). 
The element M is obtained using the interval arithmetic with rounding to the nearest. The element S is 
computed using the definition of the slope arithmetic defined in Table [TJ We take into account of the possible 
rounding errors in the result (M, S) following Proposition [TJ In case of addition and subtraction, according 
to the Equation ([5]), we do not consider absolute error ^ which is always zero. Moreover, in case of addition 
or subtraction, we handle the absorption phenomena using the function p, defined in Equation ([5]). Finally, 
we check if a zero or an overflow is generated by applying the function $ defined in Equation ([7]). 



h ± h\i = $ ((M, ± M,0(1 + P\), if, ± 1) (1 + [-/i,M])) 

with (Mg, Sg) = p{g I h) and (M,,, Sh) = p{h \ g) 
[5Xft]«(e«) = $(M, (Sg X .(M;„S,0 + Mg X S,0(1 + [-m,m])) 



fi:<»'' 



with M = (M9 xMft)(l +[-/!, Ai])+ 



t(Mfe,Sh) 



with M^^(i+[-,,,]) + g,g 

^ t(Mh,Sft) and ^ M;, 



with M= (yM;(l+[-M,M])) + [-|,|] , 

Mg n, [-00, 0] = _Li and i{Mg,Sg) Uj [-oo, 0] = _Li 



Figure 2: Abstract semantics of arithmetic expressions on floating-point slopes 



Remark 3. The functions $ and p make the arithmetic operations on floating-point slopes nan associative 
and non distributive as in floating-point arithmetic. 
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3.3 Order Structure 



In this section, we define the order structure of the set § of floating-point slopes. In particular, this structure 
is based on the lattice of intervals. We recall that the set of slopes § = I x ll^ I and an element s of S is a 
pair (M,S). 

We define a partial order, the join and the meet operations between elements of S. All these operations 
are defined as a component-wise application of the associated operations of the interval domain except the 
meet operation which needs extra care. We denote by Qj the component-wise application of the interval 
order. We can define a partial order Cg between elements of § with: 

V(Mg,Sg),(Mft,Sh) G S, (Ms,S<,) Cs [Mh,Sh) ^ C, M„ A t, . (9) 

The join operation Ug over floating-point slopes is defined in Equation (|10p. We denote by Ui the 
component-wise application of the operation Ui. 

V(M9,Ss),(M,„Sh) G §, {M,,Ss) Us (Mh,Sh) = (M,S) 

with M = Mg Ui and 5 = 89^1 Sh (10) 

There is no direct way to define the greatest lower bound of two elements of §. Indeed, two abstract 
values may represent the same concrete value but without being comparable. Hence we only have a join- 
semilattice structure. The meet operation rig over floating-point slopes is defined in Equation ([TT|) . It may 
require a conversion into interval value. We consider that the result of the meet operation introduces a new 
independent variable at index £. We denote by \Zi the strict comparison of intervals and by _Lg the least 
element of §. 

V(Mg,Ss),(Mh,SO G §, {Mg,S,) Us (M;,,Sh) = 

'-Ls if t(Mg,S9)nit(Mfc,S,0 = -Li 

^ (M3,S,) if t(M„S3)Ui^(Mft,l) 

' (Mh,Sh) if t(Mh,Sh) ^^(M,,,^^) 

.K(t(M;i,Sh) Uj i,(Mg,Sg)) otlierwise 

Note on the Widening Operator. 

In order to enforce the convergence of the fixpoint computation, we can define a widening operation Vg over 
floating-point slopes values. An advantage of our domain is that we can straightforwardly use the widening 
operations deflned for the interval domain denoted by Vi. We define the operator Vg in Equation (|12|) using 
the widening operator between intervals. The notation Vi represents the component-wise application of Vi 
between the components of the interval slopes vector. 

V(Mg,Sg),(M^,Sft) G §, (Mg,Sg)Vs(Mh,Sh) = (M,S) 

with M = Mg Vi Mh and S = Sg Vi Sh (12) 

4 Analysis of Floating-Point Programs 

The goal of the static analysis of floating-point programs using the floating-point slopes domain is to give 
for each control point and for each variable an over-approximation given by FPS of the reachable set of 
floating-point numbers. An abstract environment env' associates to each variable v € V a, value of S. The 
set V is made of the sets V'"'^ and V^^p of independent and dependent variables. 

The semantics of an assignment |w := el" in the abstract environment env" is the update of the value 
associated to v with the result of the evaluation of the arithmetic expression e using the arithmetic operations 
over FPS given in Figure [2l As the FPS domain is related to the interval domain we can straightforwardly 
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use the semantics of tests given in [IS] to refine the value of variables. Note that the semantics of tests is 
related to the meet operation defined in Equation (jlip which may conserve some relations between variables. 

We define in Equation the concretization function 7s between the join-semilattice (V — >■ §, Q§), with 
C§ the point-wise lifting comparison, and the complete lattice (p(V 

7s(t;H> (^M,s)) = y {t;K^iGl:I = M + S- (^M-mid(Vv,„d))} (13) 

In Theorem [1] we state the soundness of the floating-point analysis using FPS domain with respect to the 
concrete floating-point semantics. The later is based on the concrete semantics of floating-point expressions 
[e], see [23] for its definition. 

Theorem 1. // the set of concrete environments env is contained in the abstract environment env^ then we 
have for all instruction i representing either an assignment or a test: 

|il(en^) C7s(|j]''(e™")) . 



5 Case Studies 

In this section, we present experimental results of the static analysis of numerical programs using our 
floating-point slope domain. We based our examples on Matlab/Simulink models which are block-diagrams. 
We present as examples a second order linear filter and a square root computation with a Newton method. 

We first give a quick view of Matlab/Simulink models. In a block-diagram, each node represents an 
operation and each wire represents a value evolving during time. We consider a few operations such that 
arithmetic operations, gain operation that is multiplication by a constant, conditional statement (called 
swifc/j^ in Simulink), and unit delay block represented by ^ which acts as a memory. We can hence write 
discrete-time models thanks to finite difference equations, see [3] for further details. 

The semantics of Simulink models is based on finite-time execution. In other words, a Simulink model 
is implicitly embedded in a simulation loop modelling the temporal evolution starting from t = to a given 
final time tend. The body of this loop follows three steps: i) evaluating the inputs, ii) computing the outputs. 
Hi) updating the state variables i.e. values of the unit delay blocks. The static analysis of Simulink models 
transforms the simulation loop into a fixpoint computation. In its simple form, see [3] for further details, we 
add an extra time instant to collect all the behaviors from tend to t = -t-00. 



Linear Filter. 

We applied the floating-point slope domain on a second order linear filter defined by: = a;„ + 0.7x„_i + 

-I- 1.2yn-i - 0.7j/„-2 . 

The block-diagrams of this filter is given in Figure |3(a)[ We consider a simulation time of 25 seconds 
that is we unfold the simulation loop 25 times before making unions. The input belongs into the interval 
[0.71, 1.35]. The output of the filter is given in Figure [3 (b)[ We consider, in this example, that V"*^ contains 
the input and the four unit delay blocks that is there are five independent variables. The gray area represents 
all the possible trajectories of the output corresponding of the set of inputs. Hence we can bound the output, 
without using the widening operator, by the interval [0.7099,9.8269]. 



Newton Method. 

We applied our domain on a Newton algorithm which computes the square root of a number a using the 
following iterative sequence: Xn+i = ^ + jf- ■ 

We want to compute that is we consider the result of the Newton method after five iterations. The 
Simulink model is given in Figure 4(a) and in Figure [4(b) [ we give the model associated to one iteration of 

^This operation is equivalent to the conditional expression: if pc(eo) then ei else 62. The predicate pc has the form eo o c 
where c is a given constant and o G {>, >, 7^}. 
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(a) Simulink model 



Figure 3: Second order linear filter 




Gain 



(a) Main model (b) Content of a subsystem 

Figure 4: Simulink model of the square root computation 



the algorithm. In this case, the set V'""^ is only made of one element. For the interval input [4, 8] with the 
initial value equals to 2, we have the result [1.8547,3.0442]. 



6 Related Work 

Numerical domains have been intensively studied. A large part of numerical domains concern the polyhedral 
representation of sets. For example, we have the domain of polyhedron 10 and the variants |32J 
I5T1 m HH im ISl n\- we also have the numerical domains based on affine relations between variables 
[191 112) or the domain of linear congruences [161 . In general, all these domains are based on arithmetic with 
"good" properties such that rational numbers or real numbers. A notable exception is the floating-point 
versions of the octagon domain |24j and of the domain of polyhedron |5]. These domains give a sound 
over-approximation of the floating-point behaviors but they are not empowered to model the behaviors of 
floating-point arithmetic as we do. 

Our FPS domain is more general than numerical abstract domains made for a special purpose. For 
example, we have the domain for linear filters jllj or for the numerical precision [14 which provide excellent 
results. Nevertheless as we showed in Section [5l we can apply this domain in various situations without 
losing too much precision. 
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7 Conclusion 

We presented a new partially relational abstract numerical domain called FPS dedicated to floating-point 
variables. It is based on Krawczyk and Neumaier's work |20| on interval expansion of rational function using 
interval slopes. This domain is able to mimic the behaviors of the floating-point arithmetic such that the 
absorption phenomenon. We also presented experimental results showing the practical use of this domain in 
various contexts. 

We want to pursue the work on the FPS domain by refining the the meet operation in order to keep 
relations between variables. Moreover we would like to model more closely the behaviors of floating point 
arithmetic, for example by taking into account the hardware instructions [261 Sect. 3]. 

As an other future work, we want to apply FPS domain for the analyses of the numerical precision by 
combining the FPS domain and domains defined in [221 H]- An interesting direction should be to make an 
analysis of the numerical precision by comparing results of the FPS domain and results coming from the 
other numerical domain which bound the exact mathematical behaviors such that [5]. Hence we can avoid 
the manipulation of complex abstract values to represent rounding errors such as in [331 (111 H] ■ 
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